Use the image data/candy.jpg for this task!
Count how many:
!pip install opencv-python
import numpy as np
import matplotlib.pyplot as plt
import cv2
from skimage import io, morphology, measure
from matplotlib import colors
from collections import Counter
#plt.rcParams['text.usetex'] = True
img = cv2.imread("data/candy.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure(figsize=[6,10])
plt.imshow(img), img.shape
(<matplotlib.image.AxesImage at 0x27a794ffdf0>, (1069, 736, 3))
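#Plotting all ~790k pixels would take too long, so I draw a random sample of 1500 pixels instead.
#np.random.randint excludes the upper bound, so the image shape is used to make every row and column eligible.
pixel_pool = img[np.random.randint(0, img.shape[0], 1500), np.random.randint(0, img.shape[1], 1500), :]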
r, g, b = pixel_pool[:,0], pixel_pool[:,1], pixel_pool[:,2]
pixel_colors = pixel_pool.reshape(-1, 3)
norm = colors.Normalize(vmin=-1.,vmax=1.)
norm.autoscale(pixel_colors)
pixel_colors = norm(pixel_colors).tolist()
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(
    go.Scatter3d(
        x=r.flatten(),
        y=g.flatten(),
        z=b.flatten(),
        mode='markers',
        marker=dict(size=2, color=pixel_colors, opacity=0.8)
    ))
fig.update_layout(scene = dict(xaxis_title="red", yaxis_title="green", zaxis_title="blue"))
fig.update_layout(margin = dict(r=20, b=10, l=10, t=10)) #much of the picture is clipped otherwise
fig.show()
The first figure above shows sprinkles in several colors. I can see 7 different colors, if I'm correct: yellow, green, blue, orange, white, red and pink. I'm going to use this intuition in the next exercise to guess the number of colors in the picture.
Naturally, sprinkles of one color are perceived as having the same color. In a photograph, however, the light reflected from them is not homogeneous, as can be seen in the second figure. So looking solely at individual pixel values is not going to answer the question.
We need to group the pixels by their RGB values so that we can generalize about the colors. One way to do this is the k-means algorithm.
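To make the idea concrete, here is a minimal sketch on made-up data. The two base colors, the noise level, and all `toy_` names are assumptions for illustration, not values measured from candy.jpg. It shows that noisy samples of only two colors produce many distinct RGB triples, while k-means summarizes them with two centers close to the original colors.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two hypothetical sprinkle colors (RGB); not taken from the real image.
base_colors = np.array([[30, 140, 190],   # blue-ish
                        [230, 60, 40]])   # red-ish

# 200 noisy "pixels" per color, imitating uneven lighting.
toy_pixels = np.vstack([
    np.clip(c + rng.normal(0, 12, size=(200, 3)), 0, 255) for c in base_colors
]).astype(np.uint8)

# Raw pixel values: far more distinct colors than the two we started with.
n_distinct = len(np.unique(toy_pixels, axis=0))

# k-means groups the noisy values around two centers.
toy_km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy_pixels)
toy_centers = np.around(toy_km.cluster_centers_).astype(int)

print(n_distinct, "distinct RGB values")
print(toy_centers)
```

Fixing `random_state` only makes the sketch reproducible. Without it, the order of the cluster labels can change from run to run.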
img = img.reshape(-1,3) #flattening the image
from sklearn.cluster import KMeans
clustering = KMeans(n_clusters = 7)
clustering.fit(img)
clustering.cluster_centers_ , clustering.cluster_centers_.reshape(1,-1,3)
#These values are not integers, so we have to convert them to integers to get displayable RGB colors
(array([[191.07103622, 55.11946882, 39.22615999],
[ 28.08326395, 141.40430323, 189.06046098],
[233.54722941, 229.18710475, 226.55010944],
[ 64.05977093, 40.52298083, 23.56733882],
[ 48.99729733, 152.14812849, 46.30403179],
[236.93753881, 173.91070235, 32.10918524],
[207.18112444, 151.14793696, 142.67353564]]),
array([[[191.07103622, 55.11946882, 39.22615999],
[ 28.08326395, 141.40430323, 189.06046098],
[233.54722941, 229.18710475, 226.55010944],
[ 64.05977093, 40.52298083, 23.56733882],
[ 48.99729733, 152.14812849, 46.30403179],
[236.93753881, 173.91070235, 32.10918524],
[207.18112444, 151.14793696, 142.67353564]]]))
rounded_kmeans = np.around(clustering.cluster_centers_).reshape(1,-1,3).astype(int) #round first: astype(int) alone truncates
plt.figure(figsize=[6,3])
plt.imshow(rounded_kmeans)
plt.title("Sprinkle colors: k-means, $k=7$",size=14)
Text(0.5, 1.0, 'Sprinkle colors: k-means, $k=7$')
sorted(Counter(clustering.labels_).most_common())
[(0, 152992), (1, 52872), (2, 165183), (3, 100682), (4, 81007), (5, 127687), (6, 106361)]
labels = clustering.labels_.reshape(1069, 736)
count_dict = {}
for i in np.unique(clustering.labels_):
    blobs = np.int_(morphology.binary_opening(labels == i))
    color = np.around(clustering.cluster_centers_[i])
    count = len(np.unique(measure.label(blobs))) - 1
    count_dict[i] = [count, color]
plt.figure(figsize=[10,5])
plt.imshow(rounded_kmeans)
plt.title("Number of candies by color: k-means, $k=7$",size=14)
for i in range(7):
    plt.text(i,0,count_dict[i][0], color ="black", size=13, ha="center",va="center", fontweight='bold' )
Based on the figure above, the k-means algorithm did a pretty good job of segmenting the picture by color. Even though most of the colors seem accurate, two things look off. There's no brown candy in the image, so number 5 shouldn't be brown either. Moreover, the first color seems to be a mixture of yellow and orange.
clustering = KMeans(n_clusters = 8)
clustering.fit(img)
rounded_kmeans = np.around(clustering.cluster_centers_).reshape(1,-1,3).astype(int) #round first: astype(int) alone truncates
plt.figure(figsize=[6,3])
plt.imshow(rounded_kmeans)
plt.title("Sprinkle colors: k-means, $k=8$ ",size=14)
Text(0.5, 1.0, 'Sprinkle colors: k-means, $k=8$ ')
Fortunately, after increasing to $k=8$, the algorithm finds a separate centroid for the yellow sprinkles.
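The effect of one extra cluster can be reproduced on a toy example. This is only a sketch under stated assumptions: the three colors, the noise level, and the helper `fitted_centers` are hypothetical and not derived from candy.jpg. With too few clusters, two colors that lie close together in RGB space (here yellow and orange) share one center that sits between them. With one more cluster, each color gets its own center.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Hypothetical colors: yellow and orange are close in RGB space, blue is far away.
true_colors = np.array([[30, 140, 190],    # blue
                        [237, 213, 50],    # yellow
                        [228, 118, 26]])   # orange

# 300 noisy samples per color.
samples = np.vstack([
    np.clip(c + rng.normal(0, 10, size=(300, 3)), 0, 255) for c in true_colors
])

def fitted_centers(k):
    """Fit k-means with k clusters and return the rounded centers."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(samples)
    return np.around(km.cluster_centers_).astype(int)

centers_k2 = fitted_centers(2)  # too few clusters: yellow and orange share a center
centers_k3 = fitted_centers(3)  # one more cluster: each color gets its own center

print(centers_k2)
print(centers_k3)
```

On real photographs the separation is less clean than in this sketch, so the resulting centers still need a visual check, as done with the color strip in the cells above.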
sorted(Counter(clustering.labels_).most_common()) #Number of pixels assigned to each centroid
[(0, 78091), (1, 169863), (2, 76280), (3, 92382), (4, 122824), (5, 96861), (6, 52640), (7, 97843)]
labels = clustering.labels_.reshape(1069, 736)
count_dict = {}
for i in np.unique(clustering.labels_):
    blobs = np.int_(morphology.binary_opening(labels == i))
    color = np.around(clustering.cluster_centers_[i])
    count = len(np.unique(measure.label(blobs))) - 1
    count_dict[i] = [count, color]
#source: https://stackoverflow.com/questions/45043617/count-the-number-of-objects-of-different-colors-in-an-image-in-python/45080346
count_dict
{0: [842, array([ 49., 154., 46.])],
1: [1363, array([233., 228., 225.])],
2: [1047, array([237., 213., 50.])],
3: [2600, array([56., 43., 25.])],
4: [1601, array([176., 40., 37.])],
5: [1872, array([205., 138., 141.])],
6: [571, array([ 28., 142., 189.])],
7: [1176, array([228., 118., 26.])]}
plt.figure(figsize=[10,5])
plt.imshow(rounded_kmeans)
plt.title("Number of candies by color: k-means, $k=8$ ",size=14)
for i in range(8):
    plt.text(i,0,count_dict[i][0], color ="black", size=13, ha="center",va="center", fontweight='bold' )
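The counting loops above rely on two scikit-image steps: `morphology.binary_opening` to remove tiny specks from each color mask, and `measure.label` to number the remaining connected regions. The following minimal sketch shows both steps on a hand-made mask. The mask and its variable names are assumptions for illustration and have nothing to do with the real image.

```python
import numpy as np
from skimage import measure, morphology

# Toy binary mask: two solid blobs plus one isolated noisy pixel.
mask = np.zeros((12, 12), dtype=bool)
mask[1:5, 1:5] = True      # blob 1 (4x4 square)
mask[6:11, 6:11] = True    # blob 2 (5x5 square)
mask[0, 10] = True         # single-pixel speck

# Opening (erosion followed by dilation) removes structures smaller than
# the default cross-shaped footprint, so the speck disappears.
opened = morphology.binary_opening(mask)

# Label connected components; return_num gives the number of blobs directly.
_, n_before = measure.label(mask, return_num=True)
_, n_after = measure.label(opened, return_num=True)

print(n_before, n_after)
```

Using `return_num=True` is an alternative to `len(np.unique(...)) - 1` that does not depend on background pixels being present. Either way, touching sprinkles of the same color merge into one component and a sprinkle split by a highlight counts more than once, so the numbers are estimates rather than exact counts.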